Hadoop distcp to AWS s3 bucket

Ask Time:2018-11-14T16:58:25         Author:akr

I am trying the command below to transfer files from HDFS to an AWS S3 bucket.

Below is the code

  hadoop distcp \
  -Dfs.s3a.access.key=AKIXXXXXXXX4C7GA \
  -Dfs.s3a.secret.key=N12XXXXXXXXary24OXPt \
  -Dfs.s3a.fast.upload=true \
  hdfs://qa/user/dev_test/KL/TEST.csv \
  s3a://Cust-import/dcp/ua.10456754/119XXXX079

I am getting the below timeout error.

  18/11/14 00:47:45 INFO http.AmazonHttpClient: Unable to execute HTTP request: Connect to Cust-import.s3.amazonaws.com:443 timed out
  com.cloudera.org.apache.http.conn.ConnectTimeoutException: Connect to optimizely-import.s3.amazonaws.com:443 timed out
      at com.cloudera.org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:416)
      at com.cloudera.com.amazonaws.http.conn.ssl.SdkTLSSocketFactory.connectSocket(SdkTLSSocketFactory.java:128)
      at com.cloudera.org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180)
      at com.cloudera.org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
      at com.cloudera.org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:643)
      at com.cloudera.org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:479)
      at com.cloudera.org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
      at com.cloudera.org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
      at com.cloudera.com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:728)
      at com.cloudera.com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
      at com.cloudera.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)
      at com.cloudera.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3785)
      at com.cloudera.com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1107)
      at com.cloudera.com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:1070)
      at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:312)
      at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:260)
      at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2815)
      at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:98)
      at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2852)
      at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2834)
      at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:387)
      at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
      at org.apache.hadoop.tools.DistCp.setTargetPathExists(DistCp.java:205)
      at org.apache.hadoop.tools.DistCp.run(DistCp.java:131)
      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
      at org.apache.hadoop.tools.DistCp.main(DistCp.java:441)
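
The trace above shows the client timing out while opening a connection to port 443 of the bucket endpoint, inside S3AFileSystem.initialize and before any data is copied. One way to narrow this down might be to test plain TCP reachability from the same node, outside Hadoop. The following is only a minimal sketch: it assumes bash (for /dev/tcp) and the coreutils timeout command are available, and the function names s3a_endpoint and can_connect are hypothetical helpers, not part of Hadoop or any AWS tool.

```shell
#!/usr/bin/env bash

# Derive the host name seen in the log (<bucket>.s3.amazonaws.com)
# from an s3a:// URI. Hypothetical helper for illustration only.
s3a_endpoint() {
  local uri="$1"
  local rest="${uri#s3a://}"   # strip the scheme
  local bucket="${rest%%/*}"   # keep everything before the first slash
  printf '%s.s3.amazonaws.com\n' "$bucket"
}

# Attempt a TCP connect with a time limit (default 5 seconds).
# Returns 0 if the connection opens, non-zero if it fails or times out.
can_connect() {
  local host="$1" port="$2" secs="${3:-5}"
  timeout "$secs" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Example usage (run on the node where hadoop distcp was launched):
#   host="$(s3a_endpoint 's3a://Cust-import/dcp/ua.10456754/119XXXX079')"
#   if can_connect "$host" 443; then
#     echo "reachable: $host:443"
#   else
#     echo "not reachable: $host:443"
#   fi
```

If the connect attempt also fails here, the cause is probably network-level (firewall, routing, or a required proxy) rather than the distcp options themselves; if it succeeds, the Hadoop-side configuration would be the next place to look.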

Author: akr. Reproduced under the CC BY-SA 4.0 license with a link to the original source and this disclaimer.
Link to original article:https://stackoverflow.com/questions/53296304/hadoop-distcp-to-aws-s3-bucket